Add support for JavaScript #59

Open
gkorland wants to merge 3 commits into main from
gkorland/add-javascript-support

Conversation


@gkorland gkorland commented Jan 3, 2025

Fixes #51

Add support for JavaScript code analysis using tree-sitter.

  • Add `api/analyzers/javascript/analyzer.py` implementing the `JavaScriptAnalyzer` class using tree-sitter for JavaScript.
    • Implement methods for first and second pass analysis.
    • Use tree-sitter to parse JavaScript code.
    • Extract functions and classes from JavaScript code.
    • Connect entities in the graph.
  • Update `api/analyzers/source_analyzer.py` to include `JavaScriptAnalyzer` in the analyzers list.
  • Add the `tree-sitter-javascript` dependency to `pyproject.toml`.
  • Add utility functions for JavaScript analysis in `api/analyzers/utils.py`.

For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/FalkorDB/code-graph-backend/issues/51?shareId=XXXX-XXXX-XXXX-XXXX).

Summary by CodeRabbit

  • New Features
    • Added JavaScript support: the analysis platform now detects and analyzes .js files, extracting functions, classes, methods, docstrings/comments, and call/type relationships. JavaScript projects are integrated into the multi-language pipeline, enabling cross-language entity linking and richer code insights.


vercel bot commented Jan 3, 2025

The latest updates on your projects.

Project: code-graph-backend · Deployment: Ready · Actions: Preview, Comment · Updated (UTC): Feb 21, 2026 7:30am



coderabbitai bot commented Jan 3, 2025

No actionable comments were generated in the recent review. 🎉


📝 Walkthrough


Adds JavaScript support: new JavaScriptAnalyzer using tree-sitter for parsing and entity extraction, integrates .js files and a JS language server into the source analysis pipeline, and adds tree-sitter-javascript to project dependencies.

Changes

Cohort / File(s) Summary
JavaScript Analyzer Implementation
api/analyzers/javascript/analyzer.py
New JavaScriptAnalyzer class implementing parsing, entity discovery (functions, classes, methods), docstring/type extraction, two-pass processing (first_pass/second_pass), and symbol/type/method resolution via tree-sitter queries and LSP-assisted resolution.
Analyzer Integration
api/analyzers/source_analyzer.py
Imported JavaScriptAnalyzer, registered '.js': JavaScriptAnalyzer() in analyzers, added *.js to file discovery globs, and initialized a JS language server entry for second-pass processing.
Dependency Management
pyproject.toml
Added dependency tree-sitter-javascript>=0.25.0,<0.26.0 to project dependencies.

Sequence Diagram(s)

sequenceDiagram
    participant FS as FileSystem (*.js)
    participant Analyzer as JavaScriptAnalyzer
    participant LSP as JS Language Server
    participant Graph as CodeGraph

    rect rgba(100,149,237,0.5)
      FS->>Analyzer: provide file contents (first_pass)
      Analyzer->>Analyzer: parse with tree-sitter, create Entities
      Analyzer->>Graph: add Entities (declarations)
    end

    rect rgba(60,179,113,0.5)
      Analyzer->>LSP: request symbol/type info (second_pass)
      LSP-->>Analyzer: return resolution data
      Analyzer->>Graph: add edges (calls, inherits, references)
    end
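In plain code, the two passes in the diagram amount to roughly the following sketch (the dict-based graph and the `decls` shape are illustrative stand-ins, not the project's real `Graph` API):

```python
# Minimal two-pass sketch: pass 1 declares entities, pass 2 links them.
# The dict-based "graph" and the decls shape are illustrative only.

def first_pass(files: dict, graph: dict) -> None:
    # Register every declared function as a node before any linking happens.
    for path, decls in files.items():
        for name in decls["functions"]:
            graph["nodes"][name] = {"file": path, "kind": "function"}

def second_pass(files: dict, graph: dict) -> None:
    # With all declarations known, caller/callee pairs resolve to edges.
    for path, decls in files.items():
        for caller, callee in decls["calls"]:
            if caller in graph["nodes"] and callee in graph["nodes"]:
                graph["edges"].append((caller, callee, "CALLS"))

files = {
    "a.js": {"functions": ["main"], "calls": [("main", "helper")]},
    "b.js": {"functions": ["helper"], "calls": []},
}
graph = {"nodes": {}, "edges": []}
first_pass(files, graph)
second_pass(files, graph)
# graph["edges"] now contains ("main", "helper", "CALLS")
```

The split matters because a call site in `a.js` can only be linked once the declaration in `b.js` has been seen, which is exactly why the declaration pass must finish before linking starts.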

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

A little rabbit hops through code so neat, 🐇
Parsing tokens with tree-sitter feet,
First pass finds names, second pass ties,
Entities bloom before my eyes,
Joyful hops — the graph's complete!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 6.67%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title check — ✅ Passed: the title 'Add support for JavaScript' clearly and concisely summarizes the main change: introducing JavaScript analyzer support to the codebase.
  • Linked Issues check — ✅ Passed: the PR fully implements the requirements from issue #51 by adding JavaScript support via tree-sitter, implementing JavaScriptAnalyzer, extracting code entities, and integrating them into the existing code-graph.
  • Out of Scope Changes check — ✅ Passed: all changes are directly related to adding JavaScript support; the new analyzer implementation, integration into source_analyzer, and the tree-sitter-javascript dependency are all within scope.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (6)
api/analyzers/javascript/analyzer.py (3)

22-71: Enhance function declaration processing and handle arrow functions.

Currently, process_function_declaration only captures standard function declarations. If you'd like to capture arrow functions or function expressions, consider expanding the query or logic. Also, note that using find_child_of_type(node, 'identifier') might skip function declarations without a named identifier (like anonymous functions).

🧰 Tools
🪛 Ruff (0.8.2)

22-22: Function may be undefined, or defined from star imports

(F405)


35-35: find_child_of_type may be undefined, or defined from star imports

(F405)


45-45: find_child_of_type may be undefined, or defined from star imports

(F405)


62-62: Function may be undefined, or defined from star imports

(F405)
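Widening the capture logic to include arrow functions bound to variables could look roughly like this sketch. The `Node` dataclass is a minimal stand-in for `tree_sitter.Node`, built here only so the example runs standalone; the node type names (`function_declaration`, `variable_declarator`, `arrow_function`, `function_expression`) do correspond to the tree-sitter-javascript grammar:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    # Minimal stand-in for tree_sitter.Node, for illustration only.
    type: str
    children: list = field(default_factory=list)

def collect_function_nodes(root: Node) -> list[Node]:
    """Collect plain declarations plus arrow/function expressions bound to a variable."""
    found = []
    def walk(node: Node) -> None:
        if node.type == "function_declaration":
            found.append(node)
        elif node.type == "variable_declarator" and any(
            c.type in ("arrow_function", "function_expression") for c in node.children
        ):
            found.append(node)
        for child in node.children:
            walk(child)
    walk(root)
    return found

# Roughly models:  function mul() {}   and   const add = () => {};
tree = Node("program", [
    Node("function_declaration", [Node("identifier")]),
    Node("lexical_declaration", [
        Node("variable_declarator", [Node("identifier"), Node("arrow_function")]),
    ]),
])
found = collect_function_nodes(tree)
# found has two entries: the declaration and the arrow-bound declarator
```

In the real analyzer the same branching could be expressed as an extended tree-sitter query instead of a manual walk; the point is that matching only `function_declaration` silently drops arrow-bound functions.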


72-102: Extend class declaration handling for inheritance.

This method correctly extracts the class name from the identifier child. You may want to handle extends clauses (e.g., class Foo extends Bar) or keep track of implemented interfaces in the future.

🧰 Tools
🪛 Ruff (0.8.2)

72-72: Class may be undefined, or defined from star imports

(F405)


85-85: find_child_of_type may be undefined, or defined from star imports

(F405)


99-99: Class may be undefined, or defined from star imports

(F405)


157-220: Protect against missing function entities and arrow function calls.

  1. The second pass currently assumes function declarations are always standard. Arrow functions won't be captured, so the calls might remain unresolved.
  2. assert(caller_f is not None) may crash if the function is somehow not recognized. Consider a safer check, logging a warning, or creating a placeholder entity to avoid halting the entire analysis.
-assert(caller_f is not None)
+if caller_f is None:
+    logger.warning(f"Caller function '{caller_name}' not found. Skipping relationship.")
+    continue
🧰 Tools
🪛 Ruff (0.8.2)

216-216: Function may be undefined, or defined from star imports

(F405)
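The suggested guard pattern, distilled into a runnable sketch — `resolve_call_edges` and the dict lookup are illustrative stand-ins for the analyzer's `graph.get_function_by_name` API:

```python
import logging

logging.basicConfig(level=logging.WARNING)
logger = logging.getLogger(__name__)

def resolve_call_edges(call_sites, functions_by_name):
    # Guard instead of assert: an unresolved caller is logged and skipped,
    # so one odd file cannot halt the entire analysis run.
    edges = []
    for caller_name, callee_name in call_sites:
        caller_f = functions_by_name.get(caller_name)
        if caller_f is None:
            logger.warning("Caller function '%s' not found. Skipping relationship.", caller_name)
            continue
        edges.append((caller_f, callee_name))
    return edges

functions = {"main": "entity:main"}
edges = resolve_call_edges([("main", "helper"), ("ghost", "helper")], functions)
# Only the resolvable call survives: [("entity:main", "helper")]
```

The trade-off versus a placeholder entity: skipping loses the edge entirely, while a placeholder keeps the edge but can pollute the graph with phantom nodes; logging plus skipping is the more conservative default.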

api/analyzers/utils.py (2)

25-38: Augment arrow function or unnamed function handling.

extract_js_function_name assumes there's an identifier child. Consider fallback logic for arrow or anonymous functions if needed (e.g., generating a placeholder name).


40-53: Handle anonymous or default-export classes.

Similar to functions, classes can sometimes be declared without a direct identifier (export default class, etc.). Consider a fallback name or a distinct approach for these cases.
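One possible fallback, sketched with a stand-in `Node` class so the example is self-contained; the `<anonymous:row:col>` naming scheme is illustrative, not an existing project convention:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    # Stand-in for tree_sitter.Node; real nodes expose the same attributes.
    type: str
    text: bytes = b""
    children: list = field(default_factory=list)
    start_point: tuple = (0, 0)

def extract_js_function_name(node: Node) -> str:
    for child in node.children:
        if child.type == "identifier":
            return child.text.decode("utf-8")
    # Fallback: synthesize a stable placeholder from the source position,
    # so anonymous/arrow/default-export functions still get a unique entity name.
    row, col = node.start_point
    return f"<anonymous:{row}:{col}>"

named = Node("function_declaration", children=[Node("identifier", b"doWork")])
anon = Node("arrow_function", start_point=(12, 4))
# extract_js_function_name(named) -> "doWork"
# extract_js_function_name(anon)  -> "<anonymous:12:4>"
```

Keying the placeholder on source position keeps it deterministic across runs, which matters if graph entities are diffed between analyses.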

api/analyzers/source_analyzer.py (1)

20-21: Consider broader JavaScript-related extensions.

Mapping .js to JavaScriptAnalyzer is a good start. If you plan to handle .mjs, .cjs, or .jsx, consider extending the analyzers dictionary or making it more flexible.
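A sketch of the more flexible mapping — the `Analyzer` stub and the extension set are illustrative, not the project's actual registration code:

```python
from pathlib import Path

# Hypothetical sketch: map every JS-family extension to one shared analyzer
# instance instead of hard-coding '.js'. Analyzer stands in for JavaScriptAnalyzer.
class Analyzer:
    pass

_js_analyzer = Analyzer()
analyzers = {ext: _js_analyzer for ext in (".js", ".mjs", ".cjs", ".jsx")}

def supported(path: Path, analyzers: dict) -> bool:
    # Dispatch purely on suffix; unknown extensions are skipped by the pipeline.
    return path.suffix in analyzers

# supported(Path("app.mjs"), analyzers) -> True
# supported(Path("app.ts"), analyzers)  -> False
```

Sharing one analyzer instance across the four extensions is deliberate: the grammar is identical, so per-extension instances would only duplicate parser state.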

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c8ec9a4 and 754f412.

📒 Files selected for processing (4)
  • api/analyzers/javascript/analyzer.py (1 hunks)
  • api/analyzers/source_analyzer.py (2 hunks)
  • api/analyzers/utils.py (1 hunks)
  • pyproject.toml (1 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
api/analyzers/javascript/analyzer.py

3-3: from ..utils import * used; unable to detect undefined names

(F403)


5-5: from ...entities import * used; unable to detect undefined names

(F403)


22-22: Function may be undefined, or defined from star imports

(F405)


35-35: find_child_of_type may be undefined, or defined from star imports

(F405)


45-45: find_child_of_type may be undefined, or defined from star imports

(F405)


62-62: Function may be undefined, or defined from star imports

(F405)


72-72: Class may be undefined, or defined from star imports

(F405)


85-85: find_child_of_type may be undefined, or defined from star imports

(F405)


99-99: Class may be undefined, or defined from star imports

(F405)


123-123: File may be undefined, or defined from star imports

(F405)


216-216: Function may be undefined, or defined from star imports

(F405)

🔇 Additional comments (5)
api/analyzers/javascript/analyzer.py (3)

18-21: Analyzer constructor looks good.

The constructor properly initializes the Parser with JS_LANGUAGE. This is a succinct and clear approach.


103-156: Consider robust UTF-8 handling.

The code tries to decode the file content using UTF-8, which might fail if the file is in another encoding. This fallback is fine in many cases, but you might want to log or handle suspected encoding issues more explicitly if supporting diverse codebases.

🧰 Tools
🪛 Ruff (0.8.2)

123-123: File may be undefined, or defined from star imports

(F405)
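One way to make the decoding failure explicit rather than fatal, as a sketch — the function name and the `errors="replace"` policy are illustrative choices, not the analyzer's current behavior:

```python
import logging

logger = logging.getLogger(__name__)

def decode_source(raw: bytes, path: str = "<unknown>") -> str:
    # Try strict UTF-8 first; on failure, log the problem and substitute
    # undecodable bytes instead of aborting analysis of the whole file.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        logger.warning("File %s is not valid UTF-8 (%s); replacing bad bytes", path, exc)
        return raw.decode("utf-8", errors="replace")

# decode_source(b"let x = 1;")     -> "let x = 1;"
# decode_source(b"\xfflet x = 1;") -> "\ufffdlet x = 1;"
```

The replacement character keeps byte offsets only approximately aligned, so downstream position-based lookups should treat such files as best-effort.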


10-13: Confirm Tree-sitter JavaScript library availability.

The import statements import tree_sitter_javascript as tsjs and Language, Parser from tree_sitter look correct for parsing JavaScript. Just verify that the pyproject.toml changes (adding tree-sitter-javascript) are indeed installed in your environment.

✅ Verification successful

tree-sitter-javascript dependency is properly configured

The pyproject.toml file shows that tree-sitter-javascript version ^0.23.2 is correctly listed as a dependency in the project's Poetry configuration. This confirms that the import statement import tree_sitter_javascript as tsjs will work as expected when the dependencies are installed.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Ensure tree-sitter-javascript is installed in the current environment
pip freeze | grep "tree-sitter-javascript"

Length of output: 136


Script:

#!/bin/bash
# Check if pyproject.toml exists and contains tree-sitter-javascript dependency
fd pyproject.toml --exec cat {}

Length of output: 877


Script:

#!/bin/bash
# Check if requirements.txt exists and contains tree-sitter-javascript dependency
fd requirements.txt --exec cat {}

Length of output: 65574


Script:

#!/bin/bash
# Check if setup.py exists and contains tree-sitter-javascript dependency
fd setup.py --exec cat {}

Length of output: 29


Script:

#!/bin/bash
# Look for any other Python dependency files
fd ".*requirements.*\.txt|poetry\.lock|Pipfile" --exec cat {}

Length of output: 65602

api/analyzers/source_analyzer.py (1)

11-11: JavaScript analyzer import addition confirmed.

Adding JavaScriptAnalyzer aligns with the pull request objective to support .js files.

pyproject.toml (1)

16-16: Dependency addition looks appropriate.

tree-sitter-javascript = "^0.23.2" is correctly specified. Validate that this version meets any advanced language feature requirements you may want to parse (e.g., newer JS syntax).

✅ Verification successful

Version 0.23.2 is appropriate for the JavaScript parsing requirements

Based on the code analysis, the tree-sitter-javascript dependency is used for basic JavaScript AST parsing, focusing on:

  • Function declarations and their parameters
  • Class declarations
  • Function call expressions
  • Basic identifier extraction

The specified version (^0.23.2) is actually newer than the latest PyPI version (0.23.1) and supports all the JavaScript syntax features currently being used in the analyzer. The codebase doesn't use any advanced JavaScript features that would require a newer version, as it's primarily focused on fundamental syntax elements like functions, classes, and method calls.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if the newly added dependency is at the latest stable version
curl -s https://pypi.org/pypi/tree-sitter-javascript/json | jq '.info.version'

Length of output: 87


Script:

#!/bin/bash
# Check if there are any JavaScript files in the repository to understand the JS features being used
fd -e js -x head -n 50

Length of output: 24


Script:

#!/bin/bash
# Look for any configuration files that might specify JavaScript version/features
fd -g '*babel*' -g '*eslint*' -g '*tsconfig*' -g '*jsconfig*'

Length of output: 273


Script:

#!/bin/bash
# Check if tree-sitter-javascript is already in use and how
rg -l "tree-sitter-javascript"

Length of output: 57


Script:

#!/bin/bash
# Check the content of poetry.lock to see how tree-sitter-javascript is being used
cat poetry.lock | grep -A 10 "tree-sitter-javascript"

Length of output: 678


Script:

#!/bin/bash
# Look for Python files that might be using tree-sitter-javascript
rg -l "from tree_sitter" 

Length of output: 147


Script:

#!/bin/bash
# Check how tree-sitter-javascript is being used in the JavaScript analyzer
cat api/analyzers/javascript/analyzer.py

Length of output: 7853


Script:

#!/bin/bash
# Check utils.py to understand the common tree-sitter usage patterns
cat api/analyzers/utils.py

Length of output: 1429

Comment on lines 25 to 53
def extract_js_function_name(node: Node) -> str:
    """
    Extract the function name from a JavaScript function node.

    Args:
        node (Node): The AST node representing the function.

    Returns:
        str: The name of the function.
    """
    for child in node.children:
        if child.type == 'identifier':
            return child.text.decode('utf-8')
    return ''

def extract_js_class_name(node: Node) -> str:
    """
    Extract the class name from a JavaScript class node.

    Args:
        node (Node): The AST node representing the class.

    Returns:
        str: The name of the class.
    """
    for child in node.children:
        if child.type == 'identifier':
            return child.text.decode('utf-8')
    return ''

Language-specific utilities shouldn't be added to utils.

Comment on lines 104 to 114
    """
    Perform the first pass processing of a JavaScript source file.

    Args:
        path (Path): The path to the JavaScript source file.
        f (io.TextIOWrapper): The file object representing the opened JavaScript source file.
        graph (Graph): The Graph object where entities will be added.

    Returns:
        None
    """

The comment should describe what the function actually does; "processing JavaScript file" is too general. Specify which entities are extracted.

Comment on lines 182 to 188
    try:
        # Parse file
        content = f.read()
        tree = self.parser.parse(content)
    except Exception as e:
        logger.error(f"Failed to process file {path}: {e}")
        return

I think this is a bit of a waste, we've already read the file and parsed it on the first pass.
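One way to address this, sketched under the assumption that the analyzer object survives between passes; the `_trees` attribute and the injected `parse` callable are hypothetical names, not the project's actual API:

```python
# Avoid re-reading/re-parsing: cache the first-pass tree per file and
# hand the cached tree to the second pass.
class CachingAnalyzer:
    def __init__(self, parse):
        self._parse = parse
        self._trees = {}  # path -> tree produced during first_pass

    def first_pass(self, path, content):
        tree = self._parse(content)
        self._trees[path] = tree
        return tree

    def second_pass(self, path):
        # Reuse the cached tree instead of opening and parsing the file again.
        tree = self._trees.get(path)
        if tree is None:
            raise KeyError(f"{path} was never analyzed in first_pass")
        return tree

analyzer = CachingAnalyzer(parse=lambda content: ("tree", content))
t1 = analyzer.first_pass("a.js", "function f() {}")
t2 = analyzer.second_pass("a.js")
# t1 is t2 -> True: the second pass reuses the first-pass tree
```

The memory cost is one parse tree per analyzed file, which is usually acceptable; for very large repositories the cache could hold only the node spans the second pass needs.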

@gkorland gkorland marked this pull request as draft January 5, 2025 17:43
@gkorland gkorland requested a review from Copilot January 9, 2025 13:56

Copilot AI left a comment


Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

    caller = function_def[0]
    caller_name = caller.text.decode('utf-8')
    caller_f = graph.get_function_by_name(caller_name)
    assert(caller_f is not None)
Copilot AI Jan 9, 2025


Using assert in production code is unconventional. Consider handling this case more gracefully.

Suggested change
-    assert(caller_f is not None)
+    if caller_f is None: logger.error(f'Caller function not found: {caller_name}'); continue

Resolve merge conflicts with main branch which restructured the
analyzer pattern. Rewrite JavaScriptAnalyzer to follow the new
AbstractAnalyzer interface with get_entity_label, get_entity_name,
get_entity_docstring, get_entity_types, add_symbols, resolve_symbol,
and other required abstract methods.

Address PR review comments:
- Replace wildcard imports with explicit imports (CodeRabbit)
- Move JS-specific utils out of shared utils.py (swilly22)
- Remove redundant file re-parsing in second pass (swilly22)
- Replace assert with proper error handling pattern (copilot-reviewer)
- Improve docstrings to be more specific (swilly22)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
 def analyze_sources(self, path: Path, ignore: list[str], graph: Graph) -> None:
     path = path.resolve()
-    files = list(path.rglob("*.java")) + list(path.rglob("*.py")) + list(path.rglob("*.cs"))
+    files = list(path.rglob("*.java")) + list(path.rglob("*.py")) + list(path.rglob("*.cs")) + list(path.rglob("*.js"))

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression (High)

This path depends on a user-provided value (flagged at two locations).

Copilot Autofix

AI 9 days ago

General approach: ensure that any user‑supplied path is validated and constrained before it is used with filesystem APIs (Path.resolve, Path.rglob, Repository(path)). A common pattern is to define a safe root directory (from configuration or an environment variable), resolve both the root and the requested path, and then verify that the requested path is inside the root (using .resolve() and a prefix / ancestor check). If the check fails, reject the request.

Best fix with minimal behavior change:

  1. In api/analyzers/source_analyzer.py, add a small helper method on SourceAnalyzer to validate and normalize incoming paths:

    • Accept a str or Path.
    • Resolve it to an absolute path.
    • Optionally enforce that it is under a configured root directory, if such an environment variable exists (we’ll look it up inside the helper to avoid new imports here).
    • Ensure it is a directory for analyze_local_folder, and exists for analyze_local_repository.
    • Raise a clear ValueError (or RuntimeError) on violation.
  2. Use this helper in:

    • analyze_local_folder: instead of passing Path(path) directly, call the validator, then use the returned Path object for analyze_sources.
    • analyze_local_repository: use the same validator to get a normalized, allowed repo path, then pass that to both analyze_local_folder and Repository(...).
  3. The endpoint in tests/index.py already checks os.path.isdir(path), but that’s only used for tests. With the new validation in SourceAnalyzer, any other caller (such as api/index.py routes that eventually call analyze_local_folder / analyze_local_repository) also gets the protection.

We can implement the helper purely inside SourceAnalyzer using Path.resolve and Path.is_relative_to (Python 3.9+) or a try: relative_to fallback. No new third‑party dependencies are needed; we’ll only add an import os in api/analyzers/source_analyzer.py if we choose to read an environment variable for the allowed root.

Concretely:

  • Add a private method _normalize_and_validate_path(self, path_str: str, must_be_dir: bool = True) -> Path above analyze_local_folder.
  • In analyze_local_folder, call this helper and pass the returned Path to analyze_sources instead of constructing Path(path) directly.
  • In analyze_local_repository, call the same helper with must_be_dir=True, then use the resolved Path both for analyze_local_folder and Repository(str(resolved_path)).

This keeps existing functionality (scanning arbitrary directories) but ensures paths are absolute, normalized, and (optionally) within a configured safe root; if a root is not configured, we still normalize and ensure the path is a directory before traversing.
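The core containment check the autofix describes can be distilled into a few runnable lines; the helper name and error messages here are illustrative, not the patch's exact code:

```python
import tempfile
from pathlib import Path

def validate_under_root(path: str, root: Path) -> Path:
    # Resolve both sides first, then require containment via relative_to();
    # a ValueError from relative_to() means the candidate escapes the root.
    root = root.resolve()
    resolved = (root / path).resolve()
    try:
        resolved.relative_to(root)
    except ValueError:
        raise ValueError(f"Path '{resolved}' is outside allowed root '{root}'")
    return resolved

root = Path(tempfile.mkdtemp())
(root / "repo").mkdir()
ok = validate_under_root("repo", root)       # accepted: inside the root
try:
    validate_under_root("../outside", root)  # rejected: '..' escapes the root
    rejected = False
except ValueError:
    rejected = True
# rejected -> True
```

Resolving before checking is the essential ordering: checking the raw string first would let `repo/../../etc`-style inputs slip past a naive prefix test.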


Suggested changeset 1
api/analyzers/source_analyzer.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/api/analyzers/source_analyzer.py b/api/analyzers/source_analyzer.py
--- a/api/analyzers/source_analyzer.py
+++ b/api/analyzers/source_analyzer.py
@@ -184,6 +184,24 @@
         # Second pass analysis of the source code
         self.second_pass(graph, files, path)
 
+    def _normalize_and_validate_path(self, path: str, must_be_dir: bool = True) -> Path:
+        """
+        Normalize and validate a user-supplied path before accessing the filesystem.
+
+        The path is resolved to an absolute path to eliminate any '..' segments.
+        If 'must_be_dir' is True, the path must exist and be a directory.
+
+        Raises:
+            ValueError: If the path is not valid for analysis.
+        """
+        base_path = Path(path)
+        resolved = base_path.resolve()
+
+        if must_be_dir and not resolved.is_dir():
+            raise ValueError(f"Path '{resolved}' must be an existing directory")
+
+        return resolved
+
     def analyze_local_folder(self, path: str, g: Graph, ignore: Optional[list[str]] = []) -> None:
         """
         Analyze path.
@@ -195,8 +213,11 @@
 
         logging.info(f"Analyzing local folder {path}")
 
+        # Normalize and validate the supplied path before analysis
+        resolved_path = self._normalize_and_validate_path(path, must_be_dir=True)
+
         # Analyze source files
-        self.analyze_sources(Path(path), ignore, g)
+        self.analyze_sources(resolved_path, ignore, g)
 
         logging.info("Done analyzing path")
 
@@ -213,10 +233,13 @@
         """
         from pygit2.repository import Repository
 
-        self.analyze_local_folder(path, ignore)
+        # Normalize and validate the repository path before accessing it
+        resolved_path = self._normalize_and_validate_path(path, must_be_dir=True)
 
+        self.analyze_local_folder(str(resolved_path), ignore)
+
         # Save processed commit hash to the DB
-        repo = Repository(path)
+        repo = Repository(str(resolved_path))
         head = repo.commit("HEAD")
         self.graph.set_graph_commit(head.short_id)
 
EOF

Copilot Autofix

AI 9 days ago

General approach: constrain and validate user-provided paths before using them in filesystem operations. At minimum, decide on a safe root directory under which all analysis must occur, normalize the requested path, and ensure the normalized path is contained within that root. This also gives CodeQL a clear, recognizable mitigation pattern (normalize then prefix-check).

Best fix in this codebase without changing existing functionality more than necessary:

  1. In SourceAnalyzer.analyze_local_folder, convert the string path into a normalized Path object, reject non-absolute or non-directory paths, and (crucially) enforce that the path lies within a configurable root directory. Use Path.resolve() and .relative_to() to ensure containment.
  2. Pass the validated Path object into analyze_sources instead of constructing a new Path from the raw string.
  3. Centralize the definition of the allowed root directory in SourceAnalyzer (e.g., an environment-variable-controlled root or default to the current working directory), so that we do not change external APIs but still restrict analysis to a subtree.
  4. Keep all other behavior (files discovered via rglob, graph creation, etc.) unchanged.

Concretely:

  • In api/analyzers/source_analyzer.py:
    • Add an attribute (e.g., self.root_dir) in SourceAnalyzer.__init__ to define the root directory from an environment variable like CODE_GRAPH_ROOT_DIR or default to the process working directory (Path.cwd()), and resolve it.
    • In analyze_local_folder:
      • Resolve the incoming path to requested_path = Path(path).resolve().
      • Ensure requested_path is a directory (requested_path.is_dir()).
      • Ensure requested_path is inside self.root_dir using requested_path.relative_to(self.root_dir) in a try block; if it raises ValueError, log and raise an exception (or just log and return).
      • Call self.analyze_sources(requested_path, ignore, g) rather than recreating Path(path) inside.
  • In analyze_sources, keep the existing path = path.resolve() and rglob usage; now the input has already been constrained to lie under a safe root, satisfying CodeQL’s recommendation while preserving the method’s behavior for internal callers.

This fix addresses all variants of the alert because every path originating from HTTP (tests/index.py or api/index.py) flows through SourceAnalyzer.analyze_local_folder and then into analyze_sources, which will now only operate within the intended root directory.


Suggested changeset 1
api/analyzers/source_analyzer.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/api/analyzers/source_analyzer.py b/api/analyzers/source_analyzer.py
--- a/api/analyzers/source_analyzer.py
+++ b/api/analyzers/source_analyzer.py
@@ -36,6 +36,17 @@
 
 class SourceAnalyzer():
     def __init__(self) -> None:
+        # Define a root directory under which all analysis must occur.
+        # This can be configured via the CODE_GRAPH_ROOT_DIR environment variable
+        # and defaults to the current working directory.
+        import os
+        root_dir_env = os.environ.get("CODE_GRAPH_ROOT_DIR")
+        if root_dir_env:
+            self.root_dir = Path(root_dir_env).resolve()
+        else:
+            self.root_dir = Path.cwd().resolve()
+        self.graph: Optional[Graph] = None
+    def __init__(self) -> None:
         self.files: dict[Path, File] = {}
 
     def supported_types(self) -> list[str]:
@@ -195,9 +206,23 @@
 
         logging.info(f"Analyzing local folder {path}")
 
-        # Analyze source files
-        self.analyze_sources(Path(path), ignore, g)
+        # Resolve and validate the requested path to ensure it is within the allowed root directory.
+        requested_path = Path(path).resolve()
 
+        if not requested_path.is_dir():
+            logging.error(f"Path '{requested_path}' does not exist or is not a directory")
+            raise ValueError(f"Invalid path: '{requested_path}' must be an existing directory")
+
+        try:
+            # Ensure the requested path is contained within the configured root directory.
+            requested_path.relative_to(self.root_dir)
+        except ValueError:
+            logging.error(f"Path '{requested_path}' is outside of the allowed root directory '{self.root_dir}'")
+            raise ValueError("Invalid path: directory is outside of the allowed root")
+
+        # Analyze source files in the validated directory
+        self.analyze_sources(requested_path, ignore, g)
+
         logging.info("Done analyzing path")
 
     def analyze_local_repository(self, path: str, ignore: Optional[list[str]] = None) -> Graph:
EOF
     def analyze_sources(self, path: Path, ignore: list[str], graph: Graph) -> None:
         path = path.resolve()
-        files = list(path.rglob("*.java")) + list(path.rglob("*.py")) + list(path.rglob("*.cs"))
+        files = list(path.rglob("*.java")) + list(path.rglob("*.py")) + list(path.rglob("*.cs")) + list(path.rglob("*.js"))

Check failure: Code scanning / CodeQL
Uncontrolled data used in path expression (High)
This path depends on a user-provided value.
This path depends on a user-provided value.

Copilot Autofix (AI, 9 days ago)

In general, to fix this kind of issue you must not let arbitrary user input select arbitrary filesystem roots. Instead, restrict paths to a safe base directory (or a fixed allow-list of roots) and/or treat the user-provided value only as a name within a controlled directory. This is done by (1) defining a safe root directory (for repositories or local folders), (2) constructing a candidate path by joining the root and the user input, (3) normalizing/resolving that path, and (4) verifying that the resolved path is still within the allowed root. If the check fails, return an error.
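The four steps above can be sketched as a standalone helper; the name `safe_join` and its signature are illustrative, not part of the PR:

```python
from pathlib import Path

def safe_join(base_dir: str, user_path: str) -> Path:
    """Constrain an untrusted path to a fixed root directory (illustrative sketch)."""
    # Step 1: define the safe root.
    base = Path(base_dir).resolve()
    # Steps 2-3: join the user input to the root and normalize,
    # collapsing any ".." segments and resolving symlinks.
    candidate = (base / user_path).resolve()
    # Step 4: verify containment; Path.relative_to raises ValueError on escape.
    try:
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"Path {user_path!r} escapes the allowed root")
    return candidate
```

An absolute or `..`-laden input resolves to a location outside `base` and is rejected by the `relative_to` check, while relative inputs stay under the root.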

For this codebase, the best fix with minimal behavior change is:

  • Introduce a single helper in SourceAnalyzer that:
    • Accepts the untrusted path string and a base directory Path.
    • Constructs resolved = (base_dir / path).resolve().
    • Ensures resolved.is_dir() and that resolved is inside base_dir via resolved.is_relative_to(base_dir) (Python 3.9+) or a try: resolved.relative_to(base_dir) fallback.
  • Use this helper in analyze_local_folder before calling analyze_sources. That way, every caller that passes a string path (including both tests/index.py and api/index.py flows) will be constrained to a configured base directory such as the current working directory or a specific environment-configurable root.
  • Use the returned safe Path to call analyze_sources, so path.rglob(...) in analyze_sources always operates under the safe root.

Concretely:

  • In api/analyzers/source_analyzer.py:
    • Add import os (standard library) since we’ll read an optional env var for the base root.
    • Add a private method _resolve_and_validate_path(self, path: str) -> Path inside SourceAnalyzer before analyze_local_folder.
      • Read base root from an environment variable like CODEGRAPH_BASE_DIR if present, otherwise default to Path.cwd().
      • Resolve base_root = base_root.resolve().
      • Build candidate = (base_root / path).resolve().
      • Check candidate.is_dir() and that it is inside base_root. If not, raise ValueError.
    • Update analyze_local_folder to call this helper:
      • Replace self.analyze_sources(Path(path), ignore, g) with safe_path = self._resolve_and_validate_path(path) and then self.analyze_sources(safe_path, ignore, g).

This keeps the public API of SourceAnalyzer unchanged while ensuring that all filesystem walks start from a safe, controlled base directory and no longer directly trust arbitrary absolute/relative paths from HTTP requests.


Suggested changeset 1
api/analyzers/source_analyzer.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/api/analyzers/source_analyzer.py b/api/analyzers/source_analyzer.py
--- a/api/analyzers/source_analyzer.py
+++ b/api/analyzers/source_analyzer.py
@@ -18,6 +18,7 @@
 from multilspy.multilspy_logger import MultilspyLogger
 
 import logging
+import os
 # Configure logging
 logging.basicConfig(level=logging.DEBUG, format='%(filename)s - %(asctime)s - %(levelname)s - %(message)s')
 
@@ -184,6 +185,33 @@
         # Second pass analysis of the source code
         self.second_pass(graph, files, path)
 
+    def _resolve_and_validate_path(self, path: str) -> Path:
+        """
+        Resolve a user-provided path string against a safe base directory
+        and ensure the result is an existing directory within that base.
+
+        The base directory can be configured via the CODEGRAPH_BASE_DIR
+        environment variable; if unset, the current working directory is used.
+        """
+        base_dir_env = os.environ.get("CODEGRAPH_BASE_DIR")
+        base_dir = Path(base_dir_env) if base_dir_env else Path.cwd()
+        base_dir = base_dir.resolve()
+
+        # Join the user-provided path to the base directory and resolve it
+        candidate = (base_dir / path).resolve()
+
+        # Ensure the resolved path is a directory
+        if not candidate.is_dir():
+            raise ValueError(f"Path '{path}' is not a valid directory within the allowed base directory.")
+
+        # Ensure the candidate path is within the base directory
+        try:
+            candidate.relative_to(base_dir)
+        except ValueError:
+            raise ValueError(f"Path '{path}' escapes the allowed base directory.")
+
+        return candidate
+
     def analyze_local_folder(self, path: str, g: Graph, ignore: Optional[list[str]] = []) -> None:
         """
         Analyze path.
@@ -195,8 +223,11 @@
 
         logging.info(f"Analyzing local folder {path}")
 
+        # Resolve and validate the user-provided path against a safe base directory
+        safe_path = self._resolve_and_validate_path(path)
+
         # Analyze source files
-        self.analyze_sources(Path(path), ignore, g)
+        self.analyze_sources(safe_path, ignore, g)
 
         logging.info("Done analyzing path")
 
EOF

Copilot Autofix (AI, 9 days ago)

General approach: constrain user-controlled paths to a safe root, and normalize them before use. The analyzer should only traverse directories inside a configured “workspace root” (for example, an environment variable like CODE_GRAPH_WORKSPACE_ROOT or the current working directory), and should reject inputs that escape that root. Normalization (via Path.resolve() / os.path.realpath) must be done before checking containment.

Best concrete fix with minimal behavior change:

  1. Add a helper in SourceAnalyzer that takes an input path string, resolves it to an absolute Path, and enforces that it lies under an allowed root directory.
    • Determine the allowed root using an environment variable (e.g., CODE_GRAPH_WORKSPACE_ROOT) if present, otherwise default to the current working directory (Path.cwd()), which is safe and requires no extra configuration.
    • Use Path.resolve() on both the root and the user path.
    • Verify that resolved_user_path == allowed_root or allowed_root in resolved_user_path.parents. If not, log and raise a ValueError.
  2. Update analyze_local_folder to call this helper instead of blindly wrapping path with Path(path). Pass the resulting safe Path into analyze_sources.
  3. This ensures that any usage of analyze_local_folder (from tests/index.py or api/index.py) inherits the same validation without further changes to those files.

All changes are limited to api/analyzers/source_analyzer.py. We’ll need:

  • import os (a well-known standard lib) to read the environment variable.
  • A new private method SourceAnalyzer._resolve_and_validate_path(self, path: str) -> Path.
  • Adjusted code in analyze_local_folder to use that method and handle its result.

Suggested changeset 1
api/analyzers/source_analyzer.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/api/analyzers/source_analyzer.py b/api/analyzers/source_analyzer.py
--- a/api/analyzers/source_analyzer.py
+++ b/api/analyzers/source_analyzer.py
@@ -18,6 +18,7 @@
 from multilspy.multilspy_logger import MultilspyLogger
 
 import logging
+import os
 # Configure logging
 logging.basicConfig(level=logging.DEBUG, format='%(filename)s - %(asctime)s - %(levelname)s - %(message)s')
 
@@ -184,6 +185,33 @@
         # Second pass analysis of the source code
         self.second_pass(graph, files, path)
 
+    def _resolve_and_validate_path(self, path: str) -> Path:
+        """
+        Resolve the given path and ensure it is contained within an allowed root.
+
+        The allowed root is determined by the CODE_GRAPH_WORKSPACE_ROOT environment
+        variable if set; otherwise, it defaults to the current working directory.
+        """
+        # Determine allowed root directory
+        workspace_root = os.environ.get("CODE_GRAPH_WORKSPACE_ROOT")
+        if workspace_root:
+            allowed_root = Path(workspace_root).resolve()
+        else:
+            allowed_root = Path.cwd().resolve()
+
+        resolved_path = Path(path).resolve()
+
+        # Ensure the resolved path is within the allowed root
+        if resolved_path != allowed_root and allowed_root not in resolved_path.parents:
+            logging.error(
+                "Rejected path '%s' as it is outside the allowed root '%s'",
+                resolved_path,
+                allowed_root,
+            )
+            raise ValueError(f"Path '{path}' is outside the allowed root.")
+
+        return resolved_path
+
     def analyze_local_folder(self, path: str, g: Graph, ignore: Optional[list[str]] = []) -> None:
         """
         Analyze path.
@@ -195,8 +223,11 @@
 
         logging.info(f"Analyzing local folder {path}")
 
+        # Resolve and validate the provided path before analyzing
+        safe_path = self._resolve_and_validate_path(path)
+
         # Analyze source files
-        self.analyze_sources(Path(path), ignore, g)
+        self.analyze_sources(safe_path, ignore, g)
 
         logging.info("Done analyzing path")
 
EOF
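Both autofix variants enforce the same containment invariant with different checks: a `relative_to` try/except in the first, and a `parents` membership test in the second. A small sketch (paths are illustrative) shows that the two agree, and that the `parents` test avoids the classic string-prefix pitfall:

```python
from pathlib import Path

def inside_by_relative_to(candidate: Path, root: Path) -> bool:
    # First autofix's check: relative_to raises ValueError when candidate escapes root.
    try:
        candidate.relative_to(root)
        return True
    except ValueError:
        return False

def inside_by_parents(candidate: Path, root: Path) -> bool:
    # Second autofix's check: equal to the root, or the root is a proper ancestor.
    return candidate == root or root in candidate.parents

root = Path("/srv/work")
cases = [Path("/srv/work/proj"), Path("/srv/work"), Path("/srv/other"), Path("/etc")]
for c in cases:
    assert inside_by_relative_to(c, root) == inside_by_parents(c, root)
```

Note that `Path("/srv/workx")` is rejected by both, whereas a naive `str(candidate).startswith(str(root))` would wrongly accept it. Either check is only sound after both paths have been resolved.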
@gkorland gkorland marked this pull request as ready for review February 21, 2026 07:28
Update from >=0.23.2,<0.24.0 (no such version exists) to
>=0.25.0,<0.26.0 to match the tree-sitter>=0.25.2 base dependency.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
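Under that change, the relevant pyproject.toml entries would look roughly like this (the surrounding table structure is illustrative, only the two constraints come from the commit):

```toml
[project]
dependencies = [
    "tree-sitter>=0.25.2",
    "tree-sitter-javascript>=0.25.0,<0.26.0",
]
```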

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (3)
api/analyzers/source_analyzer.py (1)

148-148: JavaScript second-pass symbol resolution is effectively disabled.

With NullLanguageServer, AbstractAnalyzer.resolve() will throw on lsp.request_definition(...) and silently return [] (caught by the base class's blanket except). This means resolve_type / resolve_method in JavaScriptAnalyzer will never resolve any symbols — no CALLS, EXTENDS, or PARAMETERS edges will be created during the second pass.

This is fine for an initial implementation, but consider adding a TODO or logging a warning so it's clear that JS symbol resolution is a known gap. When ready, integrating a real JS/TS language server (e.g., typescript-language-server) would enable full graph connectivity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/analyzers/source_analyzer.py` at line 148, The JavaScript language-server
entry currently uses lsps[".js"] = NullLanguageServer(), which causes
AbstractAnalyzer.resolve() to throw on lsp.request_definition(...) and prevents
JavaScriptAnalyzer.resolve_type/resolve_method from producing
CALLS/EXTENDS/PARAMETERS edges; update the code to mark this as a known
limitation by adding a TODO and/or logging a clear warning where lsps[".js"] is
set (or in NullLanguageServer.__init__), referencing lsps, NullLanguageServer,
AbstractAnalyzer.resolve, lsp.request_definition, and
JavaScriptAnalyzer.resolve_type/resolve_method so future work can replace
NullLanguageServer with a real JS/TS language server (e.g.,
typescript-language-server) to enable second-pass symbol resolution.
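A minimal sketch of what that warning could look like, using a hypothetical stub in place of the codebase's actual NullLanguageServer:

```python
import logging

logger = logging.getLogger('code_graph')

class NullLanguageServer:
    """Hypothetical stand-in LSP: resolves nothing, but says so once
    instead of failing silently during the second pass."""
    _warned = False

    def request_definition(self, *args, **kwargs):
        if not NullLanguageServer._warned:
            # TODO: replace with a real JS/TS language server
            # (e.g. typescript-language-server) to enable symbol resolution.
            logger.warning("JavaScript symbol resolution is disabled: "
                           "NullLanguageServer returns no definitions")
            NullLanguageServer._warned = True
        # An empty result keeps the second pass running without edges.
        return []
```

The one-time warning makes the missing CALLS/EXTENDS/PARAMETERS edges an observable, documented gap rather than a silent one.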
api/analyzers/javascript/analyzer.py (2)

40-45: Docstring extraction only captures immediate preceding single comments.

This won't capture JSDoc blocks that are separated from the declaration by whitespace/newlines (since they wouldn't be prev_sibling), nor will it capture multi-line comment blocks split across multiple comment nodes. For typical single-block JSDoc (/** ... */), this should work fine. Consider this a known limitation for the initial implementation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/analyzers/javascript/analyzer.py` around lines 40 - 45,
get_entity_docstring currently only returns node.prev_sibling if it's a single
'comment', which misses JSDoc blocks separated by whitespace/newlines and
multi-line comment sequences; update get_entity_docstring to walk backwards from
the declaration node via prev_sibling, skipping over non-significant
whitespace/newline nodes and collecting contiguous 'comment' nodes (e.g.,
accumulate while prev_sibling.type == 'comment'), decode and join their .text
values into a single string, then return that combined docstring for the
function_declaration, class_declaration, and method_definition cases (refer to
the get_entity_docstring function and the node.prev_sibling traversal).
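The traversal the prompt describes can be sketched against a minimal stand-in for tree-sitter's sibling API; this `Node` dataclass is a hypothetical mock, not the real `tree_sitter.Node`:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    # Mock of the tree-sitter Node fields this sketch relies on.
    type: str
    text: bytes
    prev_sibling: Optional["Node"] = None

def get_entity_docstring(node: Node) -> Optional[str]:
    """Collect contiguous preceding comment nodes and join them in source order."""
    comments = []
    sibling = node.prev_sibling
    while sibling is not None and sibling.type == 'comment':
        comments.append(sibling.text.decode('utf-8'))
        sibling = sibling.prev_sibling
    if not comments:
        return None
    # Walked backwards, so reverse to restore source order before joining.
    return "\n".join(reversed(comments))
```

Since tree-sitter does not emit whitespace as tree nodes, walking contiguous `comment` siblings already skips blank lines between the JSDoc block and the declaration.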

68-72: Parameter query only captures simple identifiers.

The tree-sitter query (formal_parameters (identifier) @parameter) will miss destructured parameters ({a, b}), default values (a = 1), and rest parameters (...args), which are common in modern JavaScript. This is acceptable for an initial implementation but worth noting as a known limitation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/analyzers/javascript/analyzer.py` around lines 68 - 72, The current
Tree-sitter query only matches simple identifier parameters; update the query
passed to self.language.query in analyzer.py to also capture destructured,
default, and rest forms by including nodes like formal_parameters (identifier |
object_pattern | array_pattern | assignment_pattern | rest_pattern) `@parameter`
(or the equivalent node names used by the JavaScript grammar), then adjust the
captures handling (captures = query.captures(entity.node)) to iterate over
captures['parameter'] as before and call entity.add_symbol("parameters",
parameter) so destructured ({a,b}), default (a = 1), and rest (...args)
parameters are included along with simple identifiers.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@api/analyzers/javascript/analyzer.py`:
- Around line 1-13: The import statements in analyzer.py are already explicit
and clean, so no change is required; keep the explicit imports
(SyncLanguageServer, Entity, File, AbstractAnalyzer, tree_sitter_javascript as
tsjs, Language, Node, logging) as-is and ensure the logger variable logger =
logging.getLogger('code_graph') remains correctly named and used consistently
throughout this module.


Comment on lines +50 to +61

    def add_symbols(self, entity: Entity) -> None:
        if entity.node.type == 'class_declaration':
            heritage = entity.node.child_by_field_name('body')
            if heritage is None:
                return
            superclass_node = entity.node.child_by_field_name('name')
            # Check for `extends` clause via class_heritage
            for child in entity.node.children:
                if child.type == 'class_heritage':
                    for heritage_child in child.children:
                        if heritage_child.type == 'identifier':
                            entity.add_symbol("base_class", heritage_child)

⚠️ Potential issue | 🟡 Minor

Dead code and misleading variable names in add_symbols for class declarations.

Two issues:

  1. Line 55: superclass_node is assigned but never used (confirmed by Ruff F841). Remove it.
  2. Line 52: Variable named heritage but it fetches the 'body' field — misleading. It's also only used for the None check on line 53, which is a defensive guard that's unlikely to trigger (classes without bodies are syntactically invalid JS).
🔧 Proposed fix
     def add_symbols(self, entity: Entity) -> None:
         if entity.node.type == 'class_declaration':
-            heritage = entity.node.child_by_field_name('body')
-            if heritage is None:
+            body = entity.node.child_by_field_name('body')
+            if body is None:
                 return
-            superclass_node = entity.node.child_by_field_name('name')
             # Check for `extends` clause via class_heritage
             for child in entity.node.children:
🧰 Tools
🪛 Ruff (0.15.1)

[error] 55-55: Local variable superclass_node is assigned to but never used

Remove assignment to unused variable superclass_node

(F841)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/analyzers/javascript/analyzer.py` around lines 50 - 61, In add_symbols
(method add_symbols on Entity handling class_declaration), remove the unused
assignment to superclass_node and eliminate or simplify the misleading heritage
variable: either rename heritage to body_node and use it for a clear defensive
check or drop the body existence guard entirely since class bodies are required;
keep the loop that detects child.type == 'class_heritage' and call
entity.add_symbol("base_class", heritage_child) for heritage_child.type ==
'identifier' unchanged.


Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add support for JavaScript

3 participants